Search Result

Massive terrain data storage based on HBase

LI Zhenju, LI Xuejun, XIE Jianwei, LI Yannan

Journal of Computer Applications 2015, 35 (7): 1849-1853. DOI: 10.11772/j.issn.1001-9081.2015.07.1849

With the development of remote sensing technology, the types and volume of remote sensing data have increased dramatically over the past decades, which challenges traditional storage modes. A spatial index combining a quadtree with the Hilbert curve was proposed in this paper to address the low storage efficiency of HBase data storage. Firstly, the research status of traditional terrain data storage and of HBase-based data storage was reviewed. Secondly, the design of the combined quadtree and Hilbert spatial index for managing global data was proposed. Thirdly, an algorithm for calculating the row and column numbers from the longitude and latitude of terrain data, and an algorithm for calculating the final Hilbert code, were designed. Finally, the physical storage infrastructure for the index was designed. The experimental results show that the data loading speed on the Hadoop cluster improved by 63.79%-78.45% compared with a single computer, the query time decreased by 16.13%-39.68% compared with the traditional row key index, and the query speed was at least 14.71 MB/s, which meets the requirements of terrain data visualization.
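
The two algorithms named in the abstract (longitude/latitude to row and column, then row and column to a Hilbert code) can be illustrated with a minimal sketch. The abstract does not give the paper's tiling scheme or key layout, so everything here is an assumption: a global grid of 2^level x 2^level tiles at quadtree level `level`, the standard iterative Hilbert-curve conversion, and a hypothetical `row_key` format that merely shows how such a code could be used as an HBase row key.

```python
def lonlat_to_rowcol(lon, lat, level):
    """Map longitude/latitude to (row, col) in an assumed 2^level x 2^level global grid."""
    n = 1 << level
    col = int((lon + 180.0) / 360.0 * n)
    row = int((90.0 - lat) / 180.0 * n)
    # Clamp so that lon = 180 and lat = -90 fall into the last tile.
    return min(row, n - 1), min(col, n - 1)


def hilbert_index(level, x, y):
    """Standard conversion of cell (x, y) to its position along a Hilbert curve."""
    n = 1 << level
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so the sub-curve is oriented correctly.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def row_key(lon, lat, level):
    """Hypothetical row key: quadtree level followed by the zero-padded Hilbert code."""
    row, col = lonlat_to_rowcol(lon, lat, level)
    return f"{level:02d}-{hilbert_index(level, col, row):012d}"
```

Because neighbouring positions on a Hilbert curve are also neighbouring tiles, keys built this way tend to keep spatially close tiles close together in HBase's sorted row order, which is the usual motivation for this kind of index.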

Improvement of term frequency-inverse document frequency algorithm based on Document Triage

LI Zhenjun, ZHOU Zhurong

Journal of Computer Applications 2015, 35 (12): 3506-3510. DOI: 10.11772/j.issn.1001-9081.2015.12.3506

The Term Frequency-Inverse Document Frequency (TF-IDF) algorithm does not consider the importance of the index terms themselves within the document when computing their weights. To solve this problem, users' reading behaviors were utilized to improve the efficiency of TF-IDF. By introducing Document Triage into TF-IDF, the Interest Profile Manager (IPM) was used to collect data about users' reading behaviors, and then the document scores were computed. Because users' annotations mark important parts of the target text or reflect the users' interests, an improved term weighting algorithm named Document Triage-Term Frequency-Inverse Document Frequency (DT-TF-IDF) was proposed by introducing document scores and users' annotations into TF-IDF and giving a greater weight to annotated terms. The experimental results show that the recall, the precision and their harmonic mean of DT-TF-IDF are all higher than those of the traditional TF-IDF algorithm. The proposed DT-TF-IDF algorithm is more effective than TF-IDF and improves the accuracy of text similarity calculation.
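
The idea of scaling TF-IDF weights by a document score and boosting annotated terms can be sketched as follows. The abstract does not state the actual DT-TF-IDF formula, so the multiplicative form below, the `boost` parameter, and the function names are illustrative assumptions rather than the authors' definition.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Plain TF-IDF: one {term: weight} dict per tokenised document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid division by zero for an empty document
        weights.append({t: (c / total) * math.log(n / df[t]) for t, c in counts.items()})
    return weights


def dt_tf_idf(docs, doc_scores, annotated, boost=2.0):
    """Assumed variant: scale each weight by the document score and boost annotated terms.

    doc_scores: one score per document (e.g. derived from reading behaviour).
    annotated:  one set of user-annotated terms per document.
    """
    result = []
    for weights, score, marks in zip(tf_idf(docs), doc_scores, annotated):
        result.append({
            t: w * score * (boost if t in marks else 1.0)
            for t, w in weights.items()
        })
    return result
```

For example, with `docs = [["hbase", "index", "index"], ["hbase", "query"]]`, a term that appears in every document ("hbase") keeps a weight of zero, while an annotated term in a highly scored document receives the largest weight.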

MapReduce performance model based on multi-phase dividing

LI Zhenju, LI Xuejun, YANG Sheng, LIU Tao

Journal of Computer Applications 2015, 35 (12): 3374-3377. DOI: 10.11772/j.issn.1001-9081.2015.12.3374

To resolve the low precision and high complexity of existing MapReduce models caused by unreasonable phase partitioning granularity, a multi-phase MapReduce Model (MR-Model) with 5 partition granularities was proposed. Firstly, the research status of MapReduce models was reviewed. Secondly, the MapReduce job was divided into 5 phases (Read, Map, Shuffle, Reduce and Write) and the specific processing time of each phase was studied. Finally, the prediction performance of MR-Model was tested by experiments. The experimental results show that MR-Model fits the actual execution process of MapReduce jobs. Compared with the two existing models, P-Model and H-Model, the time precision of MR-Model is improved by 10%-30%; in the Reduce phase, its time precision is improved by 2-3 times, and the overall performance of MR-Model is better.
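
A five-phase performance model of this kind can be sketched as a sum of per-phase time estimates. The abstract does not give the per-phase formulas, so the simple "data volume divided by throughput" estimate and all names below are assumptions; the sketch also ignores overlap between phases and multiple task waves.

```python
PHASES = ("read", "map", "shuffle", "reduce", "write")


def predict_job_time(phase_bytes, phase_rates):
    """Estimate each phase as bytes / (bytes per second) and sum over the five phases."""
    per_phase = {p: phase_bytes[p] / phase_rates[p] for p in PHASES}
    return per_phase, sum(per_phase.values())


def relative_error(predicted, measured):
    """Relative prediction error, used here to compare a model against a measured run."""
    return abs(predicted - measured) / measured
```

Splitting the estimate per phase makes it possible to see which phase contributes most to a prediction error, rather than only comparing total job times.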

Fast handover mechanism based on Delaunay triangulation for FMIPv6

LI Zhenjun, LIU Xing

Journal of Computer Applications 2013, 33 (10): 2707-2710.

To solve the packet loss problem caused by inaccurate prediction of the New Access Router (NAR) in Fast Handover for Mobile IPv6 (FMIPv6), this paper proposed a triangulation-based fast handoff mechanism (TFMIPv6). In TFMIPv6, a triangulation algorithm was used to split the network into a virtual triangle topology, and tunnels were established among adjacent access routers. The candidate target Access Points (AP) were selected to quickly recalculate the new relay addresses for the mobile nodes, and packets were buffered in two potential NARs during handover. The experimental results show that the TFMIPv6 protocol achieves lower handoff latency and a lower packet loss rate than FMIPv6.
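
The triangulation step can be illustrated with a small brute-force Delaunay sketch. It is not the paper's algorithm: router coordinates are hypothetical, the O(n^4) empty-circumcircle test is chosen only for brevity, and treating the other two vertices of the triangle containing the mobile node as the two potential NARs is one possible reading of the abstract.

```python
from itertools import combinations


def orient(a, b, c):
    """Twice the signed area of triangle (a, b, c); positive if counter-clockwise."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def in_circumcircle(a, b, c, d):
    """True if d lies strictly inside the circumcircle of CCW triangle (a, b, c)."""
    (ax, ay), (bx, by), (cx, cy) = [(p[0] - d[0], p[1] - d[1]) for p in (a, b, c)]
    det = ((ax * ax + ay * ay) * (bx * cy - cx * by)
           - (bx * bx + by * by) * (ax * cy - cx * ay)
           + (cx * cx + cy * cy) * (ax * by - bx * ay))
    return det > 0


def delaunay(points):
    """Brute force: keep every triangle whose circumcircle contains no other point.

    Assumes no four points are cocircular; otherwise overlapping triangles may be returned.
    """
    triangles = []
    for i, j, k in combinations(range(len(points)), 3):
        o = orient(points[i], points[j], points[k])
        if o == 0:
            continue  # collinear points do not form a triangle
        if o < 0:
            j, k = k, j  # make the triangle counter-clockwise
        a, b, c = points[i], points[j], points[k]
        others = (m for m in range(len(points)) if m not in (i, j, k))
        if not any(in_circumcircle(a, b, c, points[m]) for m in others):
            triangles.append((i, j, k))
    return triangles


def tunnel_edges(triangles):
    """Pairs of adjacent routers (triangle edges) between which tunnels could be set up."""
    return {frozenset(e) for t in triangles for e in combinations(t, 2)}


def candidate_routers(points, triangles, position, current):
    """Other vertices of the triangle that contains `position` and includes `current`."""
    for tri in triangles:
        a, b, c = (points[v] for v in tri)
        inside = (orient(a, b, position) >= 0 and orient(b, c, position) >= 0
                  and orient(c, a, position) >= 0)
        if inside and current in tri:
            return [v for v in tri if v != current]
    return []
```

With four hypothetical routers at `(0, 0)`, `(4, 0)`, `(0, 3)` and `(5, 4)`, a mobile node at `(1, 1)` attached to router 0 gets routers 1 and 2 as candidates, since those three routers form the triangle around it.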